Noise-robust hands-free voice activity detection with adaptive zero crossing detection using talker direction estimation

Authors

  • Yuki Denda
  • Takamasa Tanaka
  • Masato Nakayama
  • Takanobu Nishiura
  • Yoichi Yamashita
Abstract

This paper proposes a novel hands-free voice activity detection (VAD) method, called adaptive zero crossing detection (AZCD), that uses talker direction estimation to exploit spatial features in addition to temporal features. It first estimates the talker direction to extract two spatial features, spatial reliability and spatial variance, based on weighted cross-power spectrum phase analysis and maximum likelihood estimation. The AZCD then detects voice activity frames in noisy environments by robustly detecting the zero crossing information of speech, with thresholds adaptively controlled by the extracted spatial features. Experimental results in an actual office room confirmed that the VAD performance of the proposed method, which uses both temporal and spatial features, is superior to that of conventional methods that use only temporal or only spatial features.
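
The idea of gating zero crossing detection with a threshold that depends on a spatial feature can be illustrated with a minimal Python sketch. This is not the authors' AZCD algorithm: the abstract does not state how the thresholds are controlled, so the linear scaling rule in `adaptive_threshold`, the function names, the default `min_crossings`, and the assumption that spatial reliability lies in the range 0 to 1 are all hypothetical choices made for illustration. The talker direction estimation step is not shown; the reliability value is simply passed in.

```python
import math


def count_level_crossings(frame, amp_threshold):
    """Count sign alternations between samples whose magnitude exceeds amp_threshold.

    Samples inside the dead band [-amp_threshold, +amp_threshold] are ignored,
    so low-level noise does not generate spurious zero crossings.
    """
    count = 0
    last_sign = 0  # sign of the most recent sample that cleared the threshold
    for x in frame:
        if abs(x) <= amp_threshold:
            continue
        sign = 1 if x > 0 else -1
        if last_sign != 0 and sign != last_sign:
            count += 1
        last_sign = sign
    return count


def adaptive_threshold(base_threshold, spatial_reliability):
    """Hypothetical control rule (an assumption, not taken from the paper).

    The threshold is halved when the talker direction estimate is fully
    reliable (1.0) and left unchanged when it is unreliable (0.0).
    """
    r = min(max(spatial_reliability, 0.0), 1.0)
    return base_threshold * (1.0 - 0.5 * r)


def is_voice_frame(frame, base_threshold, spatial_reliability, min_crossings=2):
    """Label a frame as voice if enough gated zero crossings are found."""
    thr = adaptive_threshold(base_threshold, spatial_reliability)
    return count_level_crossings(frame, thr) >= min_crossings


# Example: a 20 ms frame at 8 kHz holding a weak 200 Hz tone (4 full cycles).
frame = [0.3 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(160)]

print(is_voice_frame(frame, base_threshold=0.4, spatial_reliability=0.0))
print(is_voice_frame(frame, base_threshold=0.4, spatial_reliability=1.0))
```

In this toy example the tone never exceeds the base threshold of 0.4, so the frame is rejected when the spatial reliability is 0. With a reliability of 1 the threshold drops to 0.2, the alternating half-cycles are counted, and the frame is accepted.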

Similar resources

Noise robust voice activity detection using features extracted from the time-domain autocorrelation function

This paper presents a method of voice activity detection (VAD) for high noise scenarios, using a noise robust voiced speech detection feature. The developed method is based on the fusion of two systems. The first system utilises the maximum peak of the normalised time-domain autocorrelation function (MaxPeak). The second system uses a novel combination of cross-correlation and zero-crossing rat...

Voice Activity Detection based on Optima Multiple Featu

This paper presents a voice activity detection (VAD) scheme that is robust against noise, based on an optimally weighted combination of features. The scheme uses a weighted combination of four conventional VAD features: amplitude level, zero crossing rate, spectral information, and Gaussian mixture model likelihood. This combination in effect selects the optimal method depending on the noise co...

Evaluation of voice activity detection by combining multiple features with weight adaptation

For noise-robust automatic speech recognition (ASR), we propose a novel voice activity detection (VAD) method based on a combination of multiple features. The scheme uses a weighted combination of four conventional VAD features: amplitude level, zero crossing rate, spectral information, and Gaussian mixture model (GMM) likelihood. The weights for combination are adaptively updated using minimum...

Performance Analysis of Voice Activity Detection Algorithms for Robust Speech Recognition

The emerging applications of speech technology, especially in the fields of wireless applications, digital hearing aids or speech recognition, often require a noise reduction technique in combination with a precise Voice Activity Detector (VAD). In this paper, we compare the performance of VAD algorithms like Zero Crossing Detection (ZCD), Weak Fricative Detection (WFD), Pitch Based Dete...

Turn taking-based conversation detection by using DOA estimation

We propose a new method that detects conversation groups when multi-conversation groups exist simultaneously. The proposed method uses hands-free microphone arrays without wearable microphones. It has two main features: (a) We integrate a conventional turn taking-based conversation detection method with Direction of Arrival (DOA) estimation-based Voice Activity Detection (VAD). (b) The proposed...

Publication date: 2007